Introduction

The accumulation of mismanaged plastic waste in the environment is a growing global concern. Knowing precisely where litter is generated helps target priority areas for new mitigation policies. In this project, we combined country-level data on waste management with world population distributions and long-term projections of population and gross domestic product (GDP) to investigate whether GDP is the sole driver of mismanaged waste per person per day. This matters because mismanaged waste causes hundreds of thousands of people in the developing world to die each year from easily preventable causes, and plastic waste in particular adds a new and dangerous dimension to the problem.

For this project we used three data sets from an article on the Our World in Data site, combined with Gapminder world census data. The article explores the long-term impact of mismanaged litter and its chemical, ecological, behavioral, physical and health consequences.

The Data

This data was made available by Our World in Data, an online publication that focuses on large-scale global issues such as poverty, inequality, war, disease, and climate change. The data was gathered by researchers at Oxford University and includes nine variables: country name, country code (or abbreviation), year, per capita GDP, per capita plastic waste, total mismanaged waste, per capita mismanaged waste, coastal population, and total population. This data was then merged with Gapminder data, which contains the mapping information that allowed us to visualize plastic waste on a global scale.
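
A merge of this kind can be sketched as a join on the country code. The snippet below is a minimal illustration in Python, not the code used for this project; the field names (`code`, `continent`, `mismanaged_pc`) and the sample rows are made-up placeholders.

```python
def merge_by_code(waste_rows, geo_rows):
    """Left-join waste records to mapping records on the country code."""
    geo_by_code = {row["code"]: row for row in geo_rows}
    merged = []
    for row in waste_rows:
        geo = geo_by_code.get(row["code"], {})
        # Waste fields win on name clashes; unmatched countries keep continent=None.
        merged.append({"continent": None, **geo, **row})
    return merged


# Placeholder rows for illustration only (not values from the data set).
waste = [
    {"code": "AAA", "country": "Country A", "mismanaged_pc": 0.05},
    {"code": "BBB", "country": "Country B", "mismanaged_pc": 0.01},
]
geo = [{"code": "AAA", "continent": "Asia"}]

merged = merge_by_code(waste, geo)
```

A left join keeps every country in the waste data even when no mapping record exists, which makes unmatched countries easy to spot afterwards.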

Each data set has 186 observations for each variable. Data is not provided for some countries because of practical or logistical challenges in gathering it in these areas. These countries include, but are not limited to, Bolivia, Kazakhstan, Paraguay, Mongolia, Afghanistan and some central African countries.

Objective and Goals:

Our group’s objective for this project is to determine the main predictors of mismanaged waste. We hypothesize that GDP per capita is the strongest predictor of mismanaged waste per country. We also hope to find a second good predictor alongside GDP that helps explain high levels of waste mismanagement.

Our Predictors

The plot above shows the relationship between GDP per capita and per capita mismanaged waste. The plot slopes downward, although the relationship does not appear perfectly linear. The plot also shows the density of each variable along its axis. Both variables were log-transformed because of the right skew in each.
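
As a rough check of why a log transform helps with skewed variables like these, the sketch below compares the skewness of a long-tailed variable before and after taking logs. It is written in Python for illustration only; the numbers are made up and the `skewness` helper is our own, not part of the original analysis.

```python
import math
import statistics


def skewness(values):
    """Skewness as the mean of cubed z-scores (population standard deviation)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Made-up GDP-per-capita-like values with a long upper tail (illustration only).
gdp = [500, 800, 1200, 2000, 3500, 6000, 12000, 45000]
log_gdp = [math.log(v) for v in gdp]

raw_skew = skewness(gdp)      # strongly positive: a few very large values
log_skew = skewness(log_gdp)  # much closer to zero after the transform
```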

Comparison of Plastic Waste to Mismanaged Waste

This plot makes clear that Africa has a different relationship between GDP and mismanaged waste than the other continents: the relationship is slightly positive in Africa, whereas it is negative on every other continent.

This faceted plot shows the relationship between mismanaged waste per capita and GDP per capita by continent. In Asia there are two distinct groupings, which most probably reflect the difference between southeast Asia or densely populated countries like China and the rest of Asia. Oceania also has some outlying points, which is simply due to the small number of observations for that region. For the purpose of this report, we kept these countries in our analysis. The relationship within each continent also appears strong, which suggests that continent is a useful predictor.

The plot above combines GDP and continent as predictors of mismanaged waste. This time population was added, but no clear relationship with this variable can be seen.

For this plot, we created a coastal population percentage variable by dividing the coastal population by the total population. We are unsure why some values exceed one, but they reflect the data, which reports a higher coastal population than total population for those countries. This would be worth investigating in the future. The plot does show, however, that the percent coastal population variable has some effect, specifically in terms of continent.
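
The derived variable is a simple ratio, and values above one can be flagged for later inspection. The following Python sketch shows one way to do that; the record layout, the names `coastal_share` and `pct_coastal`, and the sample figures are hypothetical and not taken from the data set.

```python
def coastal_share(coastal_pop, total_pop):
    """Return coastal population as a proportion of total population."""
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    return coastal_pop / total_pop


# Hypothetical records (not from the data set); the second has coastal > total.
records = [
    {"country": "Country A", "coastal_pop": 2_000_000, "total_pop": 8_000_000},
    {"country": "Country B", "coastal_pop": 1_200_000, "total_pop": 1_000_000},
]

for rec in records:
    rec["pct_coastal"] = coastal_share(rec["coastal_pop"], rec["total_pop"])

# Flag rows where the share exceeds one so they can be inspected later.
suspect = [rec["country"] for rec in records if rec["pct_coastal"] > 1]
```

Keeping the flagged rows rather than dropping them matches the choice made in this report, where the values were left in the analysis.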

Which countries are contributing most to plastic waste on Earth? How does this compare to the mismanagement of waste?

Map of countries showing log GDP

Map of countries showing Plastic Waste Per Capita vs. Per Capita Mismanaged Waste

In the maps above, we first wanted to show how GDP is distributed throughout the world. The side-by-side maps then show how plastic waste is distributed and how the picture changes when we look at mismanaged waste instead. As mentioned earlier, the countries for which we have no data are shown in grey. Judging by the neighboring countries, we can make an educated guess about how these countries might compare. Higher-income countries produce more plastic waste but seem to manage it better than lower-income countries. Waste mismanagement is also higher in Asia, which we attribute in part to disposal practices like those of the US, where plastic recycling was simply sent to China. (Note that the maps above are on a log scale.)

How does a country’s population relate to mismanaged waste?

In the plot above, we aimed to show any relationship that might exist between population and per capita mismanaged waste. Such a relationship would mean that population is still significant after it has already been accounted for in the per capita measure. The plot shows that this relationship does not exist, as we had expected.

Mapping by Continents

With these models, a new Continent variable was created because we also wanted to show the continental effect on each variable: GDP, mismanaged waste and per capita mismanaged waste. Showing these distributions highlights the stark differences that exist between continents and supports continent as a good predictor.

Building Models:

The GAM Model

A generalized additive model (GAM) is a generalized linear model in which predictors can enter through smooth functions rather than only through linear terms. Our GAM is weighted by total population. For our first model, we predicted mismanaged waste per capita using the interaction between (log) GDP and Coastal Population.
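
To illustrate what weighting by total population does, here is a deliberately simplified Python sketch. It is not a GAM and not the model fitted in this report: it swaps in a weighted least-squares straight line of log waste on log GDP, with no smooth terms and no interaction. The function name and all values are made up for illustration.

```python
def weighted_line_fit(x, y, w):
    """Fit y = a + b*x by weighted least squares; return (a, b)."""
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up values: log GDP per capita, log mismanaged waste per capita, population.
log_gdp = [6.5, 7.5, 8.5, 9.5, 10.5]
log_waste = [-3.0, -3.4, -4.1, -4.9, -6.0]
population = [30e6, 120e6, 15e6, 60e6, 5e6]

intercept, slope = weighted_line_fit(log_gdp, log_waste, population)
fitted = [intercept + slope * x for x in log_gdp]
residuals = [y - f for y, f in zip(log_waste, fitted)]
```

With population weights, large countries pull the fitted line toward their own values, so the fit describes the typical person rather than the typical country.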

We can see a very evident inverse linear relationship between (log) GDP per capita and our fitted values. This tells us that (log) GDP accounts for most of the variation in the fitted values. We did include an interaction term with Coastal Population here, and the slight variation around the line could be due to this interaction.

For the two plots above we split Coastal Population into three groups: low, moderate and high. We also split GDP per capita into five categories. In the first three-panel plot, each coastal population group shows an almost identical monotonically decreasing relationship, which is most pronounced in the areas with the highest Coastal Population. The second plot shows the same slightly negative relationship.
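
The grouping step can be sketched as follows. This Python example assumes the groups are cut at quantiles so that each holds roughly the same number of countries; the report does not state how the cut points were chosen, so that is an assumption, and the values are made up.

```python
import bisect
import statistics


def quantile_groups(values, labels):
    """Assign each value to one of len(labels) roughly equal-count groups."""
    # statistics.quantiles returns len(labels) - 1 cut points.
    cuts = statistics.quantiles(values, n=len(labels))
    return [labels[bisect.bisect_left(cuts, v)] for v in values]


# Made-up coastal shares (illustration only).
pct_coastal = [0.02, 0.10, 0.25, 0.40, 0.55, 0.70, 0.85, 0.95, 0.99]
coastal_group = quantile_groups(pct_coastal, ["low", "moderate", "high"])

# Made-up log GDP values cut into five ordered categories.
log_gdp = [6.1, 6.8, 7.2, 7.9, 8.3, 8.8, 9.4, 9.9, 10.3, 10.9]
gdp_group = quantile_groups(log_gdp, [1, 2, 3, 4, 5])
```

Fixed thresholds, for example cutting the coastal share at set percentages, would be an equally reasonable choice and would make the groups easier to describe.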

Overall, our GAM performs adequately. Ideally we want an even spread of residuals around zero. The high residual in this plot belongs to Trinidad. We see slight variation and a curved pattern in the residuals, which suggests that there are other factors this model does not take into account.

The GAM Model with continents

The difference between our first and second models is that we now have another predictor: percent coastal. We still included the interaction between (log) GDP and Continent.

Again, there is a fairly clear relationship, and again it is negative, indicating that larger GDP is associated with less mismanaged waste. There is more variation than before, which we attribute to the additional predictor variable.

Here we can really see how Africa differs from the other continents. As before, we faceted our plots by continent and included the different coastal population groups (low, moderate, and high). We see a monotonically decreasing relationship for each continent except Africa, which has the opposite, monotonically increasing relationship.

As before, we can now compare different levels of GDP across the continents. Surprisingly, we see a lot of variation in Europe and Oceania, and the Americas look almost identical to Asia. Here, however, we see a positive trend across each continent, which is a little more consistent than before.

Comparing our fitted values against the residuals, we see a slightly less curved loess line. Again, the higher residuals belong to a handful of countries, including Trinidad and Ghana. We then faceted the larger plot above to see the spread of residuals across the five continents. Africa, unlike the others, has a very slight positive relationship, although this is almost negligible. There is a lot of variation in Oceania, probably because of the minimal data for this region. Europe is the most consistent and most tightly clustered, especially in comparison to the Americas and Asia.

Conclusion

For our final model, we found that a good fit included GDP and its interaction with continent, with the percent coastal population variable as an additive term. We then weighted this by total population. From the beginning, GDP was our main focus, on the understanding that a higher GDP indicates that a country is more industrialized and therefore produces more waste. Because of the differences between continents that we saw in our exploration, continent should be added as an interaction term to better explain the mismanagement of waste. Though it has its limits, the percentage of coastal population was a meaningful predictor and should be included in the model.

We did have some shortcomings in terms of data: we were missing some countries and had only a few variables to use as predictors. It would be interesting to find data on coastline length to replace the coastal population percentage. We also think that data on urban density, or the Gini index as a predictor, would add a lot to the model.